Top 10 Artists by Track Count & POS Distributions
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the full Spotify dataset
spotify = pd.read_csv('https://bcdanl.github.io/data/spotify_all.csv')

# 1. Find the top 10 most-prolific artists
top10 = spotify['artist_name'].value_counts().nlargest(10).index.tolist()

# 2. Subset and plot the POS distributions with violins
df_top10 = spotify[spotify['artist_name'].isin(top10)]

plt.figure(figsize=(12, 6))
sns.violinplot(
    x='artist_name',
    y='pos',
    data=df_top10,
    cut=0,
    inner='quartile',
    palette='tab10'
)
plt.xticks(rotation=45, ha='right')
plt.title('POS Distribution for Top 10 Artists')
plt.xlabel('Artist')
plt.ylabel('Track Position (pos)')
plt.tight_layout()
plt.show()
/var/folders/gp/qrzfglvs0plg_zk9wtgwkzyr0000gn/T/ipykernel_75043/340212433.py:14: FutureWarning:
Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.
sns.violinplot(
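The FutureWarning above names its own remedy: assign the `x` variable to `hue` and pass `legend=False`. A minimal sketch of that change follows, assuming seaborn 0.13 or later (the release line that emits this warning). The helper name `plot_pos_violins` is hypothetical and used only for illustration; the column names come from the cell above.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_pos_violins(df, artists):
    """Violin plot of `pos` per artist, colored through `hue` instead of a bare `palette`.

    `plot_pos_violins` is a hypothetical helper; it assumes `df` has the
    `artist_name` and `pos` columns used in the original cell.
    """
    subset = df[df["artist_name"].isin(artists)]
    fig, ax = plt.subplots(figsize=(12, 6))
    sns.violinplot(
        x="artist_name",
        y="pos",
        data=subset,
        hue="artist_name",  # same variable as x, as the warning recommends
        legend=False,       # suppress the redundant legend
        cut=0,
        inner="quartile",
        palette="tab10",
        ax=ax,
    )
    # Rotate the artist labels so long names do not overlap
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right")
    ax.set_title("POS Distribution for Top 10 Artists")
    ax.set_xlabel("Artist")
    ax.set_ylabel("Track Position (pos)")
    fig.tight_layout()
    return ax
```

With the `spotify` frame and `top10` list from the cell above, calling `plot_pos_violins(spotify, top10)` and then `plt.show()` should draw the same figure without the deprecation message.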
Interpretation:
Among the ten most prolific artists, some, such as the #1 artist, show very tight POS violins centered in the top 20 (“early” positions), which indicates consistently front-loaded playlist placement. Others (e.g., the #4 and #7 artists) have much wider violins stretching into the bottom half of the position range, meaning their tracks appear in both early and late positions. Most artists fall between these extremes, with moderate variation in where their songs land.
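The "tight" versus "wide" reading of the violins can be checked numerically. The sketch below summarizes each top artist's `pos` values by median and interquartile range (IQR), so a narrow violin should correspond to a small IQR. The helper name `pos_spread_by_artist` and the output column names are assumptions made for illustration, not part of the original notebook.

```python
import pandas as pd


def pos_spread_by_artist(df, top_n=10):
    """Median and IQR of `pos` for the `top_n` artists by track count.

    Hypothetical helper; assumes `df` has `artist_name` and `pos` columns.
    """
    # Same selection rule as the plotting cell: most frequent artists
    top = df["artist_name"].value_counts().nlargest(top_n).index
    subset = df[df["artist_name"].isin(top)]

    # Quartiles per artist, one column per quantile
    q = subset.groupby("artist_name")["pos"].quantile([0.25, 0.5, 0.75]).unstack()
    q.columns = ["q25", "median", "q75"]

    q["iqr"] = q["q75"] - q["q25"]                      # spread of the middle 50%
    q["n_tracks"] = subset["artist_name"].value_counts()  # aligned on artist name
    return q.sort_values("iqr")                         # tightest distributions first
```

Calling `pos_spread_by_artist(spotify)` on the frame loaded earlier returns one row per artist, sorted from the tightest to the widest distribution, which gives a quick way to confirm which violins are really the narrow ones.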